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the standard form of Jarzynski's equahty and the previous protocol postprocessing 
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I. INTRODUCTION 



A key goal in computational thermodynamics is the estimation of free energy differences 
between equilibrium states. Challenges in efficiently obtaining accurate values, however, 
continue to motivate the development of novel methods.- The discovery of new theorems in 
nonequilibrium statistical mechanics^"- have opened up a promising direction: free energy 
calculations based on simulations of driven nonequilibrium processes.-i^ The most straight- 
forward implementation of this approach involves performing multiple repetitions of a pro- 
cess in which a system is driven out of equilibrium by switching an external parameter A 
according to a protocol A = A(t), where < t < T. If the free energy difference of interest 
is between thermodynamic states defined by setting A to A and B, then the protocol is 
defined so that A and B are the end states, A(0) = A and A(T) = B. The free energy 
difference between the initial thermodynamic state and the equilibrium state at any time t, 
= F{\(t)) — F(A(0)), may then computed by applying,-i^ 



where (. . . )a is an average over all possible trajectories (realizations of the process), and Wt 
denotes the work done on the system up to time t during a particular trajectory, Zn- For a 
finite sample of trajectories Zn for n = 1, 2, N, the sample mean provides an estimator 
for this expectation. 

Unfortunately, this estimator often suffers from poor convergence. The expected number 
of realizations needed to obtain a reliable estimate of grows rapidly with the dissipation, 
(W^t)A — F\^, that invariably accompanies driven nonequilibrium processes.-"— In turn, dis- 
sipation reflects the lag that develops between the state of the system and the equilibrium 
state corresponding to the instantaneous value of the external parameter.— (See Fig[T]). 
This lag is ultimately responsible for the poor convergence of Eq. [1] 

The connection between lag and the convergence of the free energy estimator can be 
better understood by considering two limiting cases. First, consider the case of an infinitely 
slow reversible process. As the system remains in equilibrium throughout the process, there 
is no lag. In this case, convergence only requires a single sample because the work performed 
along any isothermal quasi-static trajectory, Wt, is equal to the free energy difference, F\^.— 




(1) 



n=l 
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The opposite limit is that of an infinitely fast process, in which Eq. [T] reduces to the more 
familiar free energy perturbation identity. Free energy estimates based on this identity 
converge quickly only if there is significant overlap in the important phase space regions of 
the end stateSj-^^^ which in turn refiects the lag. Likewise, in the intermediate situation of 
a finite-time process, the convergence of free energy estimates depends on overlap between 
the sampled phase space and the important phase space of an equilibrium state of interest. 

In order to reduce lag and improve convergence, strategies such as importance sampling 
of trajectories^"— and "escorted" free energy simulations^i have been introduced. (For 
a brief overview, see Ref.— .) In this paper, we consider an alternate strategy, protocol 
postprocessing. This strategy involves introducing a function A* = X*(t) with A*(0) = A(0), 
which we will refer to as the analysis protocol. The central result of this paper (Eq. [T3|) 
is an expression for the free energy difference F\* = F{X*{t)) — F(A(0)) using trajectories 
generated in the original process (in which the work parameter is switched according to 
the protocol A(t)). While this result is valid for any choice of A*(t) and reduces to Eq. [1] 
for X*(t) = X(t), we will argue that Eq. [13] provides efficient estimates of the free energy 
difference Fa* whenever the equilibrium densities corresponding to the analysis protocol 
X*{t) have a high degree of overlap with density of the system (See Fig[T]). 

Protocol postprocessing was previous introduced^i in the context of importance sampling 
in path-space.—"— In the present work, we utilize an alternate mathematical formalism, the 
Feynman-Kac theorem.— This formalism is similar to that used in the escorted free energy 
simulation method,— and indeed, the two methodologies may be used in conjunction with 
one another. The new formalism has at least two advantages over the previous method: 
first, in certain special cases, it is analytically a zero- variance estimator. Secondly, for a 
few simple model systems, we find that the bias and variance of free energy estimates are 
substantially reduced. 

II. FEYNMAN-KAC FORMALISM 

The derivation of Jarzynski's equality^*^ from the Feynman-Kac theorem has been well- 
documented.—"— In this section, we recapitulate Hummer and Szabo's derivations^ and 
extend it to protocol postprocessing. 
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A. Jarzynski's equality 



Consider a classical system with a Hamiltonian, H\{z) = H{z]X), that depends on 
its position in dimensional phase (or configuration) space, z, and a parameter vector, 
A. The system evolves according to dynamics which, if the temperature (3~^ and A are 
held constant, preserve the canonical equilibrium distribution f'^{z) ~ e'^^^^^^Qx, where 
Qx = J dz e~^^^^^^ is a partition function. These conditions are satisfied by several dynam- 
ics such as Hamilton's equations, Langevin dynamics, and the Andersen and Nose-Hoover 
thermostats. 

We are interested in driven nonequilibrium processes in which the system is first prepared 
in equilibrium with A = A(0) and temperature after which the external parameters 
are switched according to the protocol A = X(t). Each realization of this process can be 
described by the trajectory, Z = z{t). The phase space density f{z,t) of an ensemble of 
such trajectories evolves according to a Liouville-type equation, 

^^ = Cx,yf{z,t), (2) 

As the dynamics preserve the canonical distribution when A is held constant, the operator 
Cx has the property Cx ■ e"^^^^''^ = O.-i^ 

Hummer and Szabo recognized that the Feynman-Kac theorem provides a solution to the 
"sink equation" , 

dg{z, t) ^ ^^^^^ _ ^^^^ ^ ^^^^ ^^^^^^ 



dt 

where w{z,t) = —f3 (^—^^f^^, as a path-integral 
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g{z,t) = (|5(^-z(t))e^o'^--W-)'-)^^. (4) 

We remind the reader that the angled brackets denote a path-ensemble average, or 

expectation, over all possible realizations of the described driven nonequilibrium process. 

Another solution to Eq. [3] is given by an improperly normalized Boltzmann distribution, 
Q\{o)^~^^^'^^^^^^ • Equating this solution to that from the Feynman-Kac theorem immediately 



gives, 




1 e-^^M*)W = (5(^-^(t))e-/^^*) 



A ' 



(5) 



in which the work Wt = Wt{Z\A) is defined as, 



Wt 







ds 



ds 



) 



(6) 



By integrating both sides over z, we obtain 



Q\{t) 

Qx{0) 




(7) 



B. Protocol Postprocessing 

In the protocol postprocessing strategy, trajectories are first generated according to the 
sampling protocol A = X(t). Next, a potentially distinct analysis protocol A* = A*(t), 
with A*(0) = A(0), is introduced. This analysis protocol is not used to generate any new 
trajectories. Rather, the previously generated trajectories are used as samples for estimating 
the free energy difference Fa* = F{X*{t)) — F(A*(0)). The standard form of Jarzynski's 
equality can be seen as a special case where the sampling and analysis protocols are identical. 
While the formalism described below is valid for any A*, it will not always be advantageous. 
In Section HVl however, we will describe how to choose a A* that leads to an efficient free 
energy estimate. 

We begin the derivation by formally separating the evolution operator into two terms. 



where the auxiliary operator A{t) represents the difference between the evolution operators 
given the sampling and analysis protocols. 

Now consider a sink equation analogous to Eq. |3l 



where the function w*{z,t) not only includes a time-derivative of the Hamiltonian, but also 



(8) 




(9) 
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a term with the operator A{t), 



Here, the operator v4.(t) only acts on the term e ^^^*(')(^) in the numerator. One solution to 



Eq. lis g{z,t) = Q^Le-^^^'wi-) . 



By equating this solution to the path integral solution obtained from the Feynman-Kac 
theorem, we obtain an equation analogous to Eq. (5) 

l^H*(--^-W)e--->,. (11) 
where the work W^* = Wl{Z\A*) has the modified form, 

~ Jo \ ds + i3e-^H,.a<s)) J ■ ^^^^ 
Integrating over z, we obtain a protocol postprocessing form of Jarzynski's equality. 



e t 



(e-^^^A- (13) 



Again, the angled brackets denote a path-ensemble average, or expectation, over all 

possible realizations of the driven nonequilibrium process with the protocol A = A(t); the 
protocol A* = X*{t) has nothing to do with sampling. 

To be more concrete, let us consider a system moving with overdamped Langevin (Brow- 
nian) dynamics in a one-dimensional potential Ux(^t){z)- The density f{z, t) evolves according 
to the Smoluchowski equation, 

f = A,.,/ = i|K,oW/) + ^^/. (14) 

where = P( is the diffusion coefficient and the prime symbol represents a derivative 
with respect to z. 

Given an analysis protocol A* = X*{t), the auxiliary operator A{t) for this example 
system is defined as, 

A{t).f^~/3D^JAU'{z,t)f), (15) 
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where 

AU{z,t) = Ux*^t){z)-Uxit){z). (16) 
Substituting this expression into Eq. [121 we obtain a modified form of the work, 
'dUx*isMs)) PD§-^ (Af/'(z(s), ,)e-/'^^*w{^W)) ' 



ds (=-PUx*(s){^{s)) 



= j^ds (^1^11^^ + p^DAU'{z{s), s)U'^,^,){z{s)) - PDAW'izis), s)^ , (17) 

Using this expression for in Eq. [121 we can now estimate the free energy difference 
F\*(^t) — -^A(o) from trajectories generated in the process in which external parameter is 
switched according to the protocol A = A(t). 

For dimensions indexed by a, the Smoluchowski equation is, 



where = (3(a is the diffusion coefficient in dimension a. Following steps analogous to 
those above, we obtain the modified work. 



dUx*,..) .o ^ ^ dAU dUx*(s) . A „ d^AU 



where all U are implicitly functions of the position {xa'{s)} at time s, and AU is also a 
function of s. 



III. IMPORTANCE SAMPLING FORMALISM 

Section IllBI is not the first description of protocol postprocessing; it was preceded by a 
formalism based on importance sampling.— In this section, we describe the previous formal- 
ism in the current notation and compare it with the present results. 

Explicitly in terms of path integrals, we may rewrite Eq. [1] as, 

Idzp^4z] ^'"^ 

where = Jq ds ^^^^IMififll^ denotes the work performed on the system as it evolves 



along a particular trajectory in which the external parameter is changed according to the 
protocol A*, pi^*[Z] is the probability density associated with the trajectory Z, and dZ is a 
metric over paths. 

Now suppose that the external parameter is changed according to the protocol \{t) for 
which the associated probability density of a trajectory Z is p\{Z). The same free energy 
difference may be computed by estimating different path integrals,— 

JdZe-^-^.-(£af)pA[Z] _ (.e-^»-.), 

where r = pA* [Z]/ Pa[Z] is the ratio of densities. If the two protocols sampling are equivalent, 
then r = 1. 

This expression differs from Eq. [T3] in that it includes two expectations, the definitions 
of work are different, and it requires a ratio of probabilities, r. The ratio is different from 
a "modification" to the work term. For example, in overdamped Langevin dynamics, this 
ratio is,— 1^ 



r = exp 



Now suppose that we break down Wj* in Eq. [T71 into one term with and a "modification" 
term. If we multiply this modification term by — /3 and take the exponent, we obtain a term 
which is used similarly to r. 



exp 



-/3 / ds {^DAU'{z{s),s)U',,^,){z{s)) - DAU"{z{s),s)) 



(23) 



but is quite distinct. 

For multiple dimensions of overdamped Langevin dynamics, the ratio is. 



r = exp 



where again, AU are implicitly functions of {xa'{s)} and time s. As in the ID case, this 
expression is not equivalent to the Feynman-Kac estimator. 

In later sections, we will describe several advantages of the new formalism. 



IV. DISSIPATION AND LAG 



As protocol postprocessing is merely another mathematical formalism for computing free 
energies, there is no a priori reason to expect that it will perform any better or worse than 
the usual nonequilibrium work estimator, Eq. [1] For clever choices of the analysis protocol, 
however, we can show that Eq. [13] leads to a highly efficient estimator for Fa* . 

A. Exactly solved models 

Suppose that we construct a "perfect" analysis protocol \*(t) whose instantaneous equi- 
librium density is equivalent to the nonequilibrium density, so that f{z,t) = f^'i^^^{z), where 
f'^{z) = Q^^e^^^^^^^ = e~^(^^(^)~^^) denotes the equilibrium distribution corresponding to 
and A. When a perfect analysis protocol is used, then 

W: = Fa* (25) 

for every trajectory! This may be seen by first substituting f{z,t) = e ^^^^'w^^^ ^k'^ in the 
evolution equation, 

= at) ■ fiz, t) = ■ fiz, t) + Ait) ■ fiz, t) (26) 
where we have used Eq. [HI Since Cx*(t) ■ f{z,t) = for this f{z,t), we obtain, 

_ ^ ^Hx*it){z) _ ^_pH,,^,^{z)-F,* ^ ^^^^ _ 

dF^ ^ dH,,^,^{z) ^(t).e-/^(^^-w(-)) 
dt dt ^ ^^') 

By substituting this into the modified work, Eq. [12] and integrating, we obtain Eq. [25] As 
this equation is valid for every trajectory, Eq. [13] is a zero variance estimator of Fa*. As a 
demonstration of this principle, consider two exactly solved^ models: a Brownian particle 
in a harmonic oscillator that either (i) has its center moving at a constant velocity, or (ii) 
has a time-dependent natural frequency. In both cases, the potential has the general time- 
dependent form Ux{t){.z) = ^{z — z{t))'^ where the vector X{t) = {k{t), z{t)} denotes the set 
of external parameters. The Smoluchowski equation describing the evolution of the phase 



space density f{z,t) can be solved to give^ii^S 

/(M) = /^.^<-""''. (28) 

where zt = (z), /ct(^) = ~ (-2)^)5 (...) denotes an average over the distribution 

f{z, t). In case (i), the spring coefficient k{t) is held fixed at k while z{t) is switched according 
to z{t) = vt (A(t) = {k,vt}). In this case, the free energy difference is always zero and krit) 
is a constant, k. The most typical path is, 

zr{t)=vt-^{l-e-^^'''). (29) 

In case (ii), z{t) is held fixed at z{t) = and the spring coefficient k{t) is switched according 
to k{t) = vt (A(t) = {ft, 0}). In this case, zxit) = 0, and 

krit) = (30) 



In either case, we may choose the analysis protocol A*(t) = {kxit), ZT(t)} such that 
U\*(t)iz) = ^2^(2; — ZT{t)y. With this choice, the Boltzmann distribution corresponding to 
the analysis protocol is equal to f{z,t). Hence, the modified work calculated from Eq. [TTlis 
always equal to the free energy difference Fa*. In contrast, the importance sampling form 
of protocol postprocessing yields different work values for each trajectory. 



B. Dissipation Bounds Lag 

In general, it is not feasible to find a perfect analysis protocol. Indeed, in most cases, the 
nonequilibrium densities f{z,t) will not belong to the family of equilibrium distributions 
indexed by A, /^^. However, Eq. [25] suggests that efficient estimators of free energy energies 
can be obtained if we can find an analysis protocol A*(t) such that f^,(^t^{z) closely resembles 
the nonequilibrium density f{z, t). In the following paragraphs, we will make this argument 
more rigorous. 

The convergence of the protocol postprocessing form of Jarzynski's equality will depend 
on a criterion analogous to that in the original form: obtaining trajectories in which the 
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modified work, W*, is less than the free energy difference, Fa*.-*^ Chances of obtaining such 
trajectories are improved when the average dissipation, = (VVj*)a — -^a*, is smalL-ii^ This 
dissipation can be related to an information theoretic measure of overlap between the distri- 
butions f{z, t) describing the state of the system and the equilibrium state corresponding to 
the A*(t), To obtain this relation, we note that the properties of the delta function 

enable the path-ensemble average in Eq. [11] to be written as, 

{5{z - .(t))e-^^' >A = (^(^ - ^W))a (^"'^' >A;(.,t) ' (31) 

where the double subscript (•••)a-(2 indicates a path-ensemble average for trajectories driven 
with the protocol A and which pass through z at time t. Since the nonequilibrium density 
at time t is f{z,t) = {6{z — we may rearrange Eq. [TT]to obtain, 

- '"''"^ (32) 



As in Ref.— , we then take the logarithm of both sides of the equation, invoke Jensen's 
inequality, multiply both sides by fx*(t){z), and integrate over z. Our final result is. 



(>V;)a-Fa. >r'y" dzf{z,t)\n 



= r^I)[/(z,t)||/,^!(,)(z)], (33) 

where D[f{z, t) Wf^t^^^ {z)] is the Kullback-Leibler divergence, or relative entropy, between the 
nonequilibrium density and the equilibrium density corresponding to the analysis protocol. 
The relative entropy is zero when two distributions are identical and grows larger when they 
diverge^i. Eq. [33] suggests, but does not prove (the inequality goes the wrong way), that 
a reasonable strategy for reducing dissipation and improving the convergence of the free 
energy estimator is to choose an analysis protocol in which the "analysis" density closely 
resembles the evolving state of the system. 



V. GENERAL CASE 



Based on the results in Section [TV] we speculate that a reasonable strategy for minimizing 
dissipation and improving the efficiency of the free energy estimator is to choose an analysis 
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protocol A* = \*(t) so that the Kullback-Leibler divergence D[f{z,t)\\fl'i^^-^{z)] is small for 
all t. Obtaining such a protocol will usually entail a search over the space of A to find 
an equilibrium distribution fx''{z) which is similar to f{z,t). While the nonequilibrium 
distribution is not analytically tractable for most systems, it is possible to use sampled 
trajectories to compare the relative entropy between f{z, t) and f^iz) for different values of 
A. Specifically, given a set of trajectories {Zi, Z2, Z]^J\ and several candidate values of A, 
the relative entropy D[f{z, t) \\fx''{^)] is minimized by the parameter vector A that minimizes 
{Hx{z)) f(^z,t) — F\, which may be estimated by the sample average,— 



where Zn{t) denotes the state of system in phase space at time t as it evolves along the 
trajectory Z„. (Note that in D[f{z,t)\\f'^{z)], the integral / dz f{z,t)\nf{z,t) does not 
depend on A.) A reasonable choice for the search space of A is the range of the sampling 
protocol A. This choice has the advantage that Fx — -Fa(o) may be estimated via Jarzynski's 
equality; for distributions that are not accessed during the sampling protocol, it may be 
more difficult to estimate corresponding free energies. 

As noted in Section [III the flexibility in choosing A* means that the free energy F\* may 
be different from F\^. Indeed, unless there is no lag, an analysis protocol which minimizes 
the lag will always have different states than the sampling protocol. Since we are typically 
interested in free energies between the end states of the sampling protocol {A = A(0) and 
B = A(T)), this discrepancy was addressed by introducing an adaptive algorithm, nonequi- 
librium density-dependent sampling (NEDDS).— NEDDS is equally applicable to the current 
formalism. 

In brief, NEDDS entails running all Ng desired simulations of the nonequilibrium pro- 
cess simultaneously. The sampling protocol initially involves an interpolation between the 
desired end states A and B. After reaching state B, the protocol extrapolates past it until 
an adaptively determined stopping time. (While such an extrapolation may not always be 
physically meaningful, it is nearly always computationally feasible.) Without loss of gener- 
ality, let us assume that A < B. The stopping time is decided by performing the following 
calculations while the simulations are in progress: 

1. The free energy difference, Fx(t) — -^a(o)) between the initial and instantaneous state at 




(34) 
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the current time step, t, is estimated using Eq. [T] 

2. -D^est is evaluated with A values from the current state and all preceding states using 
Eq. m 

3. If the choice of A that minimizes Dxest, A™", is between A and B, A < A™" < B, then 
it is appended to the analysis protocol, X*(t) = A™". Otherwise, if it is at or beyond 
B, A™" > B, then the final value of the analysis protocol is set to B, \*{t) = B. 

4. Lastly, is incremented and Fa* is evaluated by protocol postprocessing. 

This procedure ensures that protocol postprocessing estimators can compute the free energy 
difference between the states A and B. 

VI. MODEL SYSTEMS 

We now demonstrate NEDDS with protocol postprocessing (both importance sampling 
and Feynman-Kac) formalisms and compare its efficiency to standard sample mean estimates 
from Jarzynski's equality, Eq. [H on a few toy model systems. First, consider an overdamped 
Brownian particle evolving on an one- dimensional surface, U{z, X) = — IGXz"^, as studied 
by Sun.— In this system, the free energy difference between the states A = and A = 1 at 
/3 = 1 was analytically found to be F\=i — F\=o = —62.9407.— 

As described,— simulations of nonequilibrium driven processes were performed in which 
A was switched between and 1 according to the equation of motion, 

zj^^ = Zj - DAtUj + V2DAtRj, (35) 

where zj is the position at time jAt, D = 1 is the diffusion coefficient. At = 0.0001 is the 
time step, and Rj is a standard normal random variable. A was incremented at each time 
step by vAt. NEDDS was used to obtain the analysis protocol A*(t) concluding at A*(t) = 1, 
and the free energy difference -Fa=i ~ -^a=o was computed using either Eq. [T3]or Eq. [211 For 
comparison, the standard Jarzynski estimate was applied to two types of simulations taking 
the same amount of simulation time as the analysis protocol obtained from NEDDS: either 
(i) A was switched between and 1 at a slower velocity, or (ii) the NEDDS analysis protocol 
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was used as a new sampling protocol. 

While the importance sampling formalism was found to be an improvement over the 
standard form of Jarzynski's equality,— we find that the estimator based on Eq. [12] is even 
better (Fig. Even for the fastest switching rates, where dissipation is expected to be 
high, the systematic bias is largely eliminated. No benefit was found from using the analysis 
protocol from NEDDS as a new sampling protocol; in fact, the bias was worse than with 
the constant velocity protocol. 

We also performed similar tests on another one-dimensional surface, U{z, A) = (52"^ — 102+ 
3) z + ~ ^(t)y : first described by Hummer.— Hummer's surface, a double well potential 
that includes a harmonic bias, mimics the setup of a single- molecule pulling experiment, and 
hence has been used to demonstrate estimators of free energies^"^ and other quantities^ in 
the context of these experiments. The simulations were performed using the same equation 
of motion, diffusion coefficient, and time step as described above for Sun's system. A was 
switched between -1.5 and 1.5. 

The performance trends with Hummer's system are similar to those with Sun's (Fig. |3]). 
Results from the standard form of Jarzynski's equality are more biased than with NEDDS 
and the importance sampling formalism, which in turn is more biased than the Feynman- 
Kac formalism. In contrast to Sun's system, however, the estimates from Eq. f|T3|) are 
noticeably biased at the fastest switching rates. Another distinction between the trends 
from the two systems is that results from using a constant velocity protocol and using the 
analysis protocol as a new sampling protocol are rather similar. 

As a final demonstration, we consider a two-dimensional surface, 

15 15 

U{x, y, A) = 5{x^ - i f + 5(x - y)^ + y (x + cos(7rA))2 + y (y + 1 - sin(27rA) - 2Xf, (36) 

in which A dictates the progress of a harmonic bias along a curve (Fig. S]). Simulations were 
performed as with the ID potentials, using Eq. [35] along each dimension, as well as the same 
diffusion coefficient and time step. A was switched between and 1. 

The performance trends in this system are the same as in Hummer's system (Fig. [5]). 
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VII. DISCUSSION AND CONCLUSION 



We have presented a method for analyzing nonequihbrium trajectories which borrows 
from a similar philosophy as previous work— but is based on a distinct mathematical formal- 
ism. The new formalism has the advantages that it analytically is a zero- variance estimator 
if a "perfect" analysis protocol is obtained, and it improves the convergence of free energy 
estimates in all our tested model systems. Further tests on more complex multidimensional 
systems potential future research direction. 

We expect that protocol postprocessing will be most useful when (i) there is little phase 
space overlap between the end states of interest (otherwise free energy differences can be 
computed without nonequihbrium work identities), (ii) estimates of AF from the nonequi- 
hbrium work relation suffer from poor convergence for a given nonequilbrium process in 
which the system is driven between the end states of interest and (iii) it is reasonable to 
speculate that the nonequihbrium driven process has a nonequihbrium density f{z,t) that 
always resembles an equilibrium density /^'^ parameterized by a A vector along the protocol. 
Exact convergence properties, of course, will depend on the system. 

The observed convergence benefits hint that many improved sampling and analysis algo- 
rithms based on nonequihbrium driven processes still remain to be discovered. 
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FIGURE CAPTIONS 



FIG. 1. Lag in driven nonequilibrium processes. Consider a system driven from state A to 
state B in a. finite-time process. In the above schematic, the ovals represent regions of phase 
space. The darkly shaded ovals are regions of phase space containing most of the density 
/^^^^ of the equilibrium state corresponding to the value of the external parameter at time 
t. The unshaded ovals denote the phase space regions containing most of the density ft 
actually accessed by the system during the process. In a reversible process, the two would 
be indistinguishable. Since the system is driven out of equilibrium, however, a lag builds 
up between ft and f^J^y This lag is correlated to dissipation and is ultimately responsible 
for the poor convergence of free energy estimates based on nonequilibrium processes. If one 
is able to obtain a function A* = X*{t) with A*(0) = A such that the equilibrium states 
/^?|.^^ are closer to the ft (e.g. the lightly shaded ovals), then the convergence of free energy 
estimates may be improved using Eq. [131 

FIG. 2. Comparison of free energy estimates for Sun's system: NEDDS simulations were 
analyzed with importance sampling, Eq. [2T] (circles), or the Feynman-Kac formalism, Eq. 
[13] (triangles). Standard Jarzynski estimates, Eq. [1] (squares), were performed on slower 
simulations with the same total time as the NEDDS simulations or by using the analysis 
protocol as a new sampling protocol (diamonds). The symbols indicate the mean and error 
bars indicate the standard deviation of 10000 estimates, each based on 50 trajectories. The 
simulation time step was At = 0.001 and the rate v indicates that A was incremented by 
f At at each time step of the NEDDS simulations. While the switching rates are equivalent, 
some plots are slightly offset to prevent error bar overlap. The exact free energy is shown 
shaded line. 

FIG. 3. Comparison of free energy estimates for Hummer's system. The caption for Fig. [2] 
applies here, except that the potential is Hummer's rather than Sun's and each estimate is 
based on 250 trajectories. 
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FIG. 4. Potential energy surface for a 2D system. The contour plot is of 5(x^ — 1)^ + 5(x — y)^ 
and the red line traces the equilibrium position of the harmonic bias Y(^ + cos(7rA))^ + ^(1/ + 
1 — sin(27rA) — 2A)^ as A goes from (left) to 1 (right). 

FIG. 5. Comparison of free energy estimates for a 2D system. The caption for Fig. [2] 
applies here, except that the potential is Eq. [36] rather than Sun's system, each estimate 
is based on 250 trajectories, and multidimensional versions of the importance sampling and 
Feynman-Kac formalisms were used. 
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FIG. 1. Lag in driven nonequilibrium processes. Consider a system driven from state A to state 
B in a finite-time process. In the above schematic, the ovals represent regions of phase space. 
The darkly shaded ovals are regions of phase space containing most of the density /^^^-j of the 
equilibrium state corresponding to the value of the external parameter at time t. The unshaded 
ovals denote the phase space regions containing most of the density ft actually accessed by the 
system during the process. In a reversible process, the two would be indistinguishable. Since 
the system is driven out of equilibrium, however, a lag builds up between ft and fxit)' This lag 
is correlated to dissipation and is ultimately responsible for the poor convergence of free energy 
estimates based on nonequilibrium processes. If one is able to obtain a function A* = X*{t) with 
A*(0) = A such that the equilibrium states are closer to the ft (e.g. the lightly shaded ovals), 

then the convergence of free energy estimates may be improved using Eq. [T3j 
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FIG. 2. Comparison of free energy estimates for Sun's system: NEDDS simulations were analyzed 
with importance sampling, Eq. [21] (circles), or the Feynman-Kac formalism, Eq. [13] (triangles). 
Standard Jarzynski estimates, Eq. [1] (squares), were performed on slower simulations with the same 
total time as the NEDDS simulations or by using the analysis protocol as a new sampling protocol 
(diamonds). The symbols indicate the mean and error bars indicate the standard deviation of 
10000 estimates, each based on 50 trajectories. The simulation time step was At = 0.001 and 
the rate v indicates that A was incremented by vAt at each time step of the NEDDS simulations. 
While the switching rates are equivalent, some plots are slightly offset to prevent error bar overlap. 
The exact free energy is shown as a shaded line. 



35r— ^ 




Switching rate, v 

FIG. 3. Comparison of free energy estimates for Hummer's system. The caption for Fig. [2] applies 
here, except that the potential is Hummer's rather than Sun's and each estimate is based on 250 
trajectories. 
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FIG. 4. Potential energy surface for a 2D system. The contour plot is of 5(x^ — 1)^ -|- 5(x — y)^ 
and the red line traces the equilibrium position of the harmonic bias ^(x -|- cos(7rA))^ -|- ^(y -|- 1 — 
sin(27rA) — 2A)^ as A goes from (left) to 1 (right). 
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FIG. 5. Comparison of free energy estimates for a 2D system. The caption for Fig. [2] applies 
here, except that the potential is Eq. [36] rather than Sun's system, each estimate is based on 
250 trajectories, and multidimensional versions of the importance sampling and Feynman-Kac 
formalisms were used. 
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